mutation probability
Searching the Search Space of Vision Transformer-- -- Supplementary Material-- -- Minghao Chen
The details include: searching in the searched space; the Q-K-V dimension can be smaller than the embedding dimension. In this section, we present the details of supernet training and the evolutionary algorithm. Finally, we update the corresponding weights with the fused gradients. Alg. 2 shows the evolution search in our method.
- Oceania > Australia > New South Wales > Sydney (0.05)
- North America > United States > New York > Suffolk County > Stony Brook (0.05)
- Asia (0.05)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.42)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.42)
Multi-SpaCE: Multi-Objective Subsequence-based Sparse Counterfactual Explanations for Multivariate Time Series Classification
Deep Learning systems excel in complex tasks but often lack transparency, limiting their use in critical applications. Counterfactual explanations, a core tool within eXplainable Artificial Intelligence (XAI), offer insights into model decisions by identifying minimal changes to an input that alter its predicted outcome. However, existing methods for time series data are limited by univariate assumptions, rigid constraints on modifications, or a lack of validity guarantees. This paper introduces Multi-SpaCE, a multi-objective counterfactual explanation method for multivariate time series. Using the non-dominated sorting genetic algorithm II (NSGA-II), Multi-SpaCE balances proximity, sparsity, plausibility, and contiguity. Unlike most methods, it ensures perfect validity, supports multivariate data, and provides a Pareto front of solutions, offering flexibility for different end-user needs. Comprehensive experiments on diverse datasets demonstrate that Multi-SpaCE consistently achieves perfect validity and delivers superior performance compared to existing methods.
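The objectives the search balances can be illustrated with generic stand-ins. The `proximity` and `sparsity` functions below are simplified placeholders, not Multi-SpaCE's exact definitions, and the `[channel][time]` nested-list encoding of a multivariate series is an assumption:

```python
def proximity(x, cf):
    """L2 distance between the original series x and the counterfactual cf,
    both encoded as [channel][time] lists (a hypothetical encoding)."""
    return sum((a - b) ** 2
               for ch_x, ch_c in zip(x, cf)
               for a, b in zip(ch_x, ch_c)) ** 0.5

def sparsity(x, cf):
    """Fraction of (channel, time) points left unchanged; higher means
    a sparser counterfactual, i.e. fewer modifications."""
    total = sum(len(ch) for ch in x)
    unchanged = sum(a == b
                    for ch_x, ch_c in zip(x, cf)
                    for a, b in zip(ch_x, ch_c))
    return unchanged / total

# A 2-channel series where the counterfactual changes a single point.
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
cf = [[1.0, 2.0, 3.0], [4.0, 9.0, 6.0]]
```

In a multi-objective setting such as NSGA-II, these two values would be evaluated per candidate and never collapsed into a single score; the algorithm keeps the full Pareto front instead.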
Onsite Job Scheduling by Adaptive Genetic Algorithm
Basak, Avijit, Acharya, Subhas
Onsite Job Scheduling is a specialized variant of the Vehicle Routing Problem (VRP) with multiple depots. The objective is to execute jobs requested by customers in different geographic locations using a limited number of technicians, with minimum travel and overtime. Each job is expected to be completed within a time limit specified by the service level agreement (SLA) with the customer. Each technician is assumed to start from a base location, serve several customers, and return to the starting place. Technicians are allotted jobs based on their skill sets, their expertise level in each skill, and their availability slots. Although there is a considerable body of literature on the VRP, we do not see any explicit work on Onsite Job Scheduling. In this paper we propose an Adaptive Genetic Algorithm to solve the scheduling problem. We found optimized travel routes for a substantial number of jobs and technicians, minimizing travel distance and overtime duration while meeting SLA constraints.
- Asia > India (0.05)
- North America > United States > Tennessee > Knox County > Knoxville (0.04)
- North America > United States > North Carolina > New Hanover County > Wilmington (0.04)
- (3 more...)
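One way a GA for this kind of problem can encode a candidate is a list where position j holds the technician assigned to job j, with assignment restricted to technicians whose skill set covers the job. This is a generic sketch, not the paper's encoding; all names and data here are illustrative:

```python
import random

def random_schedule(num_jobs, technicians, skills_required, rng):
    """Build one random candidate schedule: for each job, pick a
    technician (uniformly) among those whose skills cover the job."""
    schedule = []
    for j in range(num_jobs):
        capable = [t for t, skills in technicians.items()
                   if skills_required[j] <= skills]  # subset test
        schedule.append(rng.choice(capable))
    return schedule

# Illustrative data: two technicians, three jobs.
technicians = {"t1": {"hvac", "electrical"}, "t2": {"plumbing"}}
skills_required = [{"hvac"}, {"plumbing"}, {"electrical"}]
rng = random.Random(1)
candidate = random_schedule(3, technicians, skills_required, rng)
```

A fitness function would then score each candidate by total travel distance and overtime, with crossover and mutation operating on the job-to-technician list.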
Broadcasting in random recursive dags
Briend, Simon, Devroye, Luc, Lugosi, Gabor
A uniform $k$-{\sc dag} generalizes the uniform random recursive tree by picking $k$ parents uniformly at random from the existing nodes. It starts with $k$ ''roots''. Each of the $k$ roots is assigned a bit. These bits are propagated by a noisy channel. The parents' bits are flipped with probability $p$, and a majority vote is taken. When all nodes have received their bits, the $k$-{\sc dag} is shown without identifying the roots. The goal is to estimate the majority bit among the roots. We identify the threshold for $p$ as a function of $k$ below which the majority rule among all nodes yields an error $c+o(1)$ with $c<1/2$. Above the threshold the majority rule errs with probability $1/2+o(1)$.
- North America > Canada > Quebec > Montreal (0.14)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- (2 more...)
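The growth-and-voting process described in the abstract can be simulated directly. The sketch below makes two simplifying assumptions: parents are sampled without replacement, and ties in an even-sized vote resolve to 0 (the paper's exact tie-breaking rule is not specified here):

```python
import random

def simulate_majority(k, n, p, rng):
    """Grow a uniform k-dag to n nodes: node t >= k picks k distinct
    parents uniformly from existing nodes, each parent bit is flipped
    with probability p, and the node takes a majority vote. Returns
    whether the all-node majority matches the root majority bit."""
    bits = [rng.randint(0, 1) for _ in range(k)]  # root bits
    for _ in range(n - k):
        parents = rng.sample(range(len(bits)), k)
        # XOR with a Bernoulli(p) flip models the noisy channel.
        votes = sum(bits[q] ^ (rng.random() < p) for q in parents)
        bits.append(1 if 2 * votes > k else 0)
    root_majority = 1 if 2 * sum(bits[:k]) > k else 0
    estimate = 1 if 2 * sum(bits) > len(bits) else 0
    return estimate == root_majority
```

With k = 1 and p = 0 the root bit propagates unchanged, so the majority rule is always correct; estimating the error probability for larger k and p would require averaging over many runs.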
Adaptive Parameters Methods for Machine Learning
In this post, I will discuss the ideas behind adaptive parameter methods for machine learning, why and when to implement them, and some practical examples using Python. Adaptive methods (also known as parameter scheduling) are strategies for updating some model parameters at training time according to a schedule. The change depends on the model's state at time t; for example, you can update parameters depending on the loss value, the number of iterations/epochs, the elapsed training time, etc. In neural networks, for instance, the choice of learning rate has several consequences: if the learning rate is too large, training may overshoot the minimum; if it is too small, it may take too long to converge or get stuck in a local minimum. In this scenario, we change the learning rate as a function of the epochs: you set a large rate at the beginning of training and, as the epochs increase, decrease the value until you reach a lower threshold.
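The epoch-based schedule described above can be sketched in a few lines; the decay factor, step size, and floor value here are arbitrary choices for illustration:

```python
def decayed_lr(initial_lr, epoch, decay_rate=0.5, decay_every=10, min_lr=1e-4):
    """Step-decay schedule: multiply the learning rate by decay_rate
    every decay_every epochs, but never drop below min_lr."""
    lr = initial_lr * (decay_rate ** (epoch // decay_every))
    return max(lr, min_lr)

# The rate starts large and shrinks as training progresses:
# epochs 0, 10, 20, 30 -> 0.1, 0.05, 0.025, 0.0125
schedule = [decayed_lr(0.1, e) for e in range(0, 40, 10)]
```

The same pattern generalizes to any scheduled parameter: replace the epoch counter with loss value or wall-clock time as the input to the schedule function.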
Learning to Communicate with Strangers via Channel Randomisation Methods
We introduce two methods for improving the performance of agents meeting for the first time to accomplish a communicative task. The methods are: (1) `message mutation' during the generation of the communication protocol; and (2) random permutations of the communication channel. These proposals are tested using a simple two-player game involving a `teacher' who generates a communication protocol and sends a message, and a `student' who interprets the message. After training multiple agents via self-play we analyse the performance of these agents when they are matched with a stranger, i.e. their zero-shot communication performance. We find that both message mutation and channel permutation positively influence performance, and we discuss their effects.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- (3 more...)
- Education (0.46)
- Leisure & Entertainment > Games (0.46)
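The channel-permutation idea, relabelling discrete symbols so agents cannot rely on arbitrary symbol identities, can be sketched minimally. The vocabulary size and message contents below are illustrative, not taken from the paper's setup:

```python
import random

def permute_channel(message, perm):
    """Apply a fixed relabelling of discrete channel symbols.
    `message` is a sequence of symbol indices; `perm[s]` is the
    new label of symbol s."""
    return [perm[s] for s in message]

rng = random.Random(0)
vocab_size = 5
perm = list(range(vocab_size))
rng.shuffle(perm)                       # a random bijection on symbols
scrambled = permute_channel([0, 3, 3, 1], perm)
```

Training a student against many such permutations encourages it to decode structure (e.g. repeated symbols) rather than the teacher's particular symbol choices, which is the intuition behind improved zero-shot communication.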
A Rank based Adaptive Mutation in Genetic Algorithm
Traditionally, the Genetic Algorithm has been used for the optimization of unimodal and multimodal functions. Earlier researchers worked with constant probabilities for GA control operators such as crossover and mutation to tune optimization in specific domains. Recent advancements in this field have introduced adaptive approaches to probability determination. In adaptive mutation, primarily poor individuals are used to explore the state space, so the mutation probability is usually generated in proportion to the difference between the fitness of the best chromosome and the individual's own (fMAX - f). However, this approach is susceptible to the nature of the fitness distribution during optimization. This paper presents an alternative approach that generates the mutation probability from the chromosome's rank, avoiding any susceptibility to the fitness distribution. Experiments compare a simple genetic algorithm (SGA) with constant mutation probability against the adaptive approaches, within a limited resource constraint, on unimodal functions, multimodal functions, and the Travelling Salesman Problem (TSP). We measure average best fitness, the number of generations evolved, and the percentage of trials that reach the global optimum. The results demonstrate that the rank-based adaptive mutation approach is superior to both the fitness-based adaptive approach and the SGA in a multimodal problem space.
- North America > United States > Michigan (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Maryland (0.04)
- (4 more...)
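The contrast between the two schemes in the abstract can be made concrete. The `p_max` scaling and the exact rank-to-probability mapping below are illustrative assumptions, not the paper's formulas:

```python
def fitness_based_pm(fitnesses, p_max=0.3):
    """Adaptive mutation: probability proportional to (f_max - f),
    so the worst individuals mutate most. Sensitive to how fitness
    values are spread out."""
    f_max = max(fitnesses)
    spread = f_max - min(fitnesses)
    if spread == 0:                     # degenerate: all fitnesses equal
        return [p_max / 2] * len(fitnesses)
    return [p_max * (f_max - f) / spread for f in fitnesses]

def rank_based_pm(fitnesses, p_max=0.3):
    """Rank-based variant: probabilities depend only on the ordering
    of individuals, so they are immune to the shape of the fitness
    distribution."""
    n = len(fitnesses)
    order = sorted(range(n), key=lambda i: fitnesses[i], reverse=True)
    pm = [0.0] * n
    for rank, i in enumerate(order):    # rank 0 = best individual
        pm[i] = p_max * rank / (n - 1)
    return pm
```

With fitnesses [10, 5, 1], both schemes give the best individual probability 0 and the worst 0.3, but if the fitness values were skewed (say [10, 9.9, 1]), only the rank-based probabilities would stay evenly spaced.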
Adaptive Mutation in Genetic Algorithm with Python Examples - neptune.ai
The genetic algorithm is a popular evolutionary algorithm. It uses Darwin's theory of natural evolution to solve complex problems in computer science. But, to do so, the algorithm's parameters need a bit of adjusting. One of the key parameters is mutation. It makes random changes in the chromosomes (i.e., the genes).
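The mutation operator the snippet refers to is, in its simplest binary form, an independent bit-flip per gene; a minimal sketch, assuming a binary chromosome encoding:

```python
import random

def mutate(chromosome, p_m, rng=random):
    """Classic bit-flip mutation: flip each gene of a binary chromosome
    independently with probability p_m."""
    return [1 - g if rng.random() < p_m else g for g in chromosome]

rng = random.Random(0)
child = mutate([0, 1, 0, 1, 1, 0], p_m=0.2, rng=rng)
```

Making the operator adaptive then amounts to computing `p_m` per individual (from fitness, rank, or generation number) instead of keeping it constant.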
A Tailored NSGA-III Instantiation for Flexible Job Shop Scheduling
Wang, Yali, van Stein, Bas, Emmerich, Michael T. M., Bäck, Thomas
A customized multi-objective evolutionary algorithm (MOEA) is proposed for the multi-objective flexible job shop scheduling problem (FJSP). It uses smart initialization approaches to enrich the first generated population and proposes various crossover operators to create better diversity among offspring. In particular, the MIP-EGO configurator, which can tune algorithm parameters, is adopted to tune operator probabilities automatically. Furthermore, different local search strategies are employed to explore the neighborhood for better solutions. In general, the algorithm enhancement strategy can be integrated with any standard EMO algorithm. In this paper, it has been combined with NSGA-III to solve benchmark multi-objective FJSPs, whereas an off-the-shelf implementation of NSGA-III is not capable of solving the FJSP. The experimental results show excellent performance with a smaller computing budget.